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Abstract 

The accumulation of adaptive mutations is essential for survival in novel environments. However, in clonal populations with 
a high mutational supply, the power of natural selection is expected to be limited. This is due to clonal interference - the 
competition of clones carrying different beneficial mutations - which leads to the loss of many small effect mutations and 
fixation of large effect ones. If interference is abundant, then mechanisms for horizontal transfer of genes, which allow the 
immediate combination of beneficial alleles in a single background, are expected to evolve. However, the relevance of 
interference in natural complex environments, such as the gut, is poorly known. To address this issue, we have developed 
an experimental system which allows to uncover the nature of the adaptive process as Escherichia coii adapts to the mouse 
gut. This system shows the invasion of beneficial mutations in the bacterial populations and demonstrates the 
pervasiveness of clonal interference. The observed dynamics of change in frequency of beneficial mutations are consistent 
with soft sweeps, where different adaptive mutations with similar phenotypes, arise repeatedly on different haplotypes 
without reaching fixation. Despite the complexity of this ecosystem, the genetic basis of the adaptive mutations revealed a 
striking parallelism in independently evolving populations. This was mainly characterized by the insertion of transposable 
elements in both coding and regulatory regions of a few genes. Interestingly, in most populations we observed a complete 
phenotypic sweep without loss of genetic variation. The intense clonal interference during adaptation to the gut 
environment, here demonstrated, may be important for our understanding of the levels of strain diversity of £ coii 
inhabiting the human gut microbiota and of its recombination rate. 
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Introduction 

Mutation is the fuel of evolution and beneficial mutations the 
driver of organismal adaptation. When a small group of organisms 
founds a new niche it will rely on de novo mutations to adjust to 
such novel environment. If the rate of emergence of new beneficial 
mutations is low, adaptation will proceed through the asynchro- 
nous accumulation of such mutations, one at a time. This will 
result in short-term polymorphism, during which the frequency of 
the beneficial mutation will rapidly change. Such strong selective 
sweeps will purge linked neutral variation, a phenomenon long 
recognized as the signature of selective sweeps in bacterial 
populations [1]. Thus, for reasonably small populations and for 
low mutation rates, population mean fitness will increase in steps, 
where at each step neutral variation becomes completely depleted. 
The adaptive walk will therefore proceed by discrete movements 
along the fitness landscape. 

Over the years studies of microbial adaptation in different 
environments have repeatedly detected deviations from this simple 
pattern [2-6]. It is now much more commonly accepted that, in 



reasonably large microbial populations, many distinct adaptive 
mutations may arise and compete for fixation. The pattern of 
microbial adaptation has therefore been supportive of an 
evolutionary mechanism described in the sixties - the HUl- 
Roberstoii effect [7]. This has been coined clonal interference 
(CI) in the context of clonal bacterial populations (reviewed in [8]). 
This effect is theoretically expected to limit the speed of adaptation 
in asexual versus sexual populations [9]. A great number of 
beneficial mutations are lost and the distribution of mutations that 
fix can be greatly affected by interference [8,10,11]. 

But how strong is CI and what consequences does it entail? 
While it has been shown to occur in bacteria [5,12] and eukaryotes 
[3,13] in laboratory settings, its relevance in natural environments 
is poorly known. Interestingly, CI has recently been inferred to be 
an important determinant of the evolution of the influenza virus 
[14]. Several examples from the study of rapid adaptation in 
natural populations have also shown the violation of the classical 
hard selective sweep model as well as the assumption of the 
mutation limited regime of adaptation. In contrast multiple 
adaptive alleles at the same locus can sweep through the 
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Author Summary 

Adaptation to novel environnnents involves the accumu- 
lation of beneficial mutations. If these are rare the 
process will proceed slowly with each one sweeping to 
fixation on its own. On the contrary if they are common 
in clonal populations, individuals carrying different 
beneficial alleles will experience intense competition 
and only those clones carrying the stronger effect 
mutations will leave a future line of descent. This 
phenomenon is known as clonal interference and the 
extent to which it occurs in natural environments is 
unknown. One of the most complex natural environ- 
ments for £ coli is the mammalian intestine, where it 
evolves in the presence of many species comprising the 
gut microbiota. We have studied the dynamics of 
adaptation of £ coli populations evolving in this relevant 
ecosystem. We show that clonal interference is pervasive 
in the mouse gut and that the targets of natural selection 
are similar in independently £ coli evolving populations. 
These results illustrate how experimental evolution in 
natural environments allows us to dissect the mecha- 
nisms underlying adaptation and its complex dynamics 
and further reveal the importance of mobile genetic 
elements in contributing to the adaptive diversification 
of bacterial populations in the gut. 



populations at the same time, a plienomenon known as soft 
sweeps. These alleles can emerge from de novo mutation or from 
standing genetic variation (see [15] for a revision). The phenom- 
enon of CI is theoretically expected to impact the dynamics of 
adaptation if the effective population size (jVJ and/or the rate of 
occurrence of beneficial mutations (Uj) is large. More specifically, 
when the number of competing mutations is bigger than one, that 
is if 2JVg Ui, Ln(jV,, Sj,/2) >1 (where Sj, is the mean effect of a 
beneficial mutation [8]). Desai and Fisher [16] made the case that 
in extremely large populations with very high mutational inputs, 
haplotypes with multiple beneficial mutations are expected to arise 
and increase in frequency. Furthermore, important advances of 
the theory of CI have also recently been made [11,17,18]. Since 
the parameter values important to determine the importance and 
level of interference are expected to be dependent on the 
environment, the relevance of CI for bacterial evolution in natural 
conditions remains to be demonstrated. 

E. coli K-12 has been an important model organism in many 
fields of biology, since its isolation from human feces in 1922 [19]. 
Accumulation of mutations during its adaptation to the gut has 
been observed [20-22], but the fitness landscape characteristic of 
this environment as well as the extent of interference E. coli 
experiences is not known. 

We have studied the process of accumulation of beneficial 
mutations and the strength of their effects in a natural 
environment of E. coli, the mouse gut. We found that CI is 
pervasive in vivo and described the genetic basis of the initial steps 
of adaptation, which were observed to exhibit a striking 
parallelism. Remarkably we have found that in the same 
population, distinct mutations with equivalent functional effects 
(i.e., targeting the same gene or operon) reach detectable 
frequencies simultaneously. This leads to the occurrence of a 
phenotypic hard sweep without loss of variation at the genetic 
level. However, most of these mutations get extinct after a few 
generations, a signature of soft sweeps [23]. 



Results and Discussion 

Dynamics of adaptation of £ coli through changes in the 
frequency of a neutral marker 

A genetically homogeneous population of E. coli, except for a 
chromosomaUy encoded fluorescence marker (see Material and 
Methods), was used to trace the occurrence of different adaptive 
mutations [2] and determine the strength of their fitness effects. 
This population, composed of equal amounts of two subpopula- 
tions expressing either a yellow (YFP) or cyan (CFP) fluorescent 
protein, was used to inoculate inbred mice orally. Subsequently, 
we followed fluorescent-markers frequency from daily collected 
fecal samples, for 24 days. The experiment was repeated three 
times to a total of 15 mice housed individually, where E. coli 
adaptation was followed. 

The adaptive dynamics observed in 15 independendy evolving 
populations are shown in Figure lA (see also Figure SI for the E. 
coli loads over time). Some of the dynamics observed, exhibit a 
typical signature of CI. An initial increase in frequency of one 
neutral marker, followed by replacement by the subpopulation 
bearing another (e.g. Figure lA, population 1.9) was diagnostic of 
CI. Presumably, the markers hitchhike with successions of 
beneficial mutations [2]. We also observed cases in which the 
frequencies of the markers remained stable, (Figure lA, popula- 
tions 1.6 and 1.8), for up to 24 days. This time corresponds to 
approximately 400 generations, if we take the 80 minute 
generation time estimated in the gut [24] . Stable marker dynamics 
are expected under two completely opposing scenarios: neutral 
evolution due to the lack of occurrence of adaptive mutations, or 
very intense CI caused by strong mutations of similar effect 
occurring in the different fluorescence backgrounds simultaneous- 
ly. To distinguish between these scenarios and to further show that 
strong adaptive mutations are occurring, we performed in vivo 
competitive fitness assays of several clones with the ancestral strain. 
We tested clones from populations where a clear signal of 
adaptation was detected (populations 1.12 and 1.13 in Figure lA) 
or was not (1.6 and 1.8 in Figure lA). All clones tested were found 
to carry beneficial mutations (Figure IB). As expected, the clones 
isolated after a clear sweep (Figure IB, populations 1.12 and 1.13), 
showed a fitness increase. Importantly, clones sampled from the 
populations in which only slight changes in marker frequency were 
detected also showed a fitness increase (Figure IB, populations 1.6 
and 1.8). This demonstrates that the evolution process involved 
multiple adaptive mutations competing for fixation. Consistent 
with the small changes in marker frequency observed, the 
competitive fitnesses were similar for clones isolated from the 
same evolving population, but bearing distinct neutral markers 
(Mann-Whitney test P=0.7 population 1.6, P=0.4 population 
1.8). 

The distribution of effects of mutations that contributed 
to adaptation 

From the dynamics of neutral markers one can try to estimate 
the important evolutionary parameters of the adaptive process: 
mutation rate and selective effects [2,25-27]. One method that has 
been used [25] summarizes neutral marker dynamics in replicate 
experiments in two statistics: i) the time it takes for one of the 
markers to first change significantly in frequency and ii) the slope 
associated with that frequency change. The time for the first 
marker deviation corresponds to the time where the natural 
logarithm of the marker ratio significantly changes from the initial 
ratio. The slope associated with a frequency change is the slope of 
the linear regression of the natural logarithm of the ratio between 
the two markers after a significant frequency change is detected. 
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Figure 1 . Evidence for rapid adaptation and CI in vivo. A) Dynamics of marker frequency (with 95% confidence intervals) during the adaptation 
of £ coli to the gut upon colonization (populations 1.1 to 1.15). The predictions of the simplest model of Darwinian selection [22], for each set of data 
points are shown as lines. The lines correspond to the model that assumes multiple beneficial mutations (/= 1,2,5) can occur in a given clone at a 
given time (Tb,), and these clones have a given fitness (l/V,). Tb, and l/V, are fitted by maximum likelihood and the best model, in terms of number of 
mutations, is chosen according to Akaike criteria. Representative examples of trajectories for the classical signature of a selective sweep (populations 
1.5, 1.12 and 1.13) and for the maintenance of neutral diversity under with intense CI (populations 1.6, 1.8 and 1.9) are shown in colours. B) Direct 
evidence for adaptation and CI: the bars represent the mean selection coefficient (±2 s.e.m, n = 3) from an in vivo competitive fitness assay between 
evolved and ancestral clones. The first bar shows the neutrality of the fluorescent marker. The following six bars represent the results from 
competitions of a mixture of thirty clones (with a given fluorescent marker, indicated below the bar) isolated from the respective population 
(indicated below each bar). Both CFP and YFP mixtures of clones from lineages 1 .6 and 1 .8 show similar levels of adaptation, consistent with intense 
CI maintaining a high frequency of both neutral markers. The dashed bars show the results of competitions (n = 2) of single clones isolated from 
lineage 1.12. 

doi:1 0.1 371/journal.pgen.1 0041 82.g001 



These summary statistics lead to estimation of two efTective 
evolutionary parameters {U^ and s^) [2] . Under intense CI C/, (the 
effective genomic mutation rate) will be an underestimate of f/j, 
and (the effective selection coefficient of beneficial mutations) an 
overestimate of [27]. For the dynamics in Figure 1, the summary 
statistics imply ~7xlO and ,r, ~0.075. 

We also have used a recently developed method [26] to estimate 
the fitness of the haplotypes that segregate at sufficiently high 
frequency to lead to changes in marker dynamics. This method 
takes the frequencies of the neutral markers across all the time 
points of the experiment and allows for the occurrence of CI. We 
started by estimating the distribution of fitness effects of the 
minimum number of haplotypes, as proposed by lUingworth and 
Mustonen [26], to fit the marker frequency data. Assuming the 
simplest possible model of Darwinian selection, with constant 
selection across space and time (admittedly an oversimplified view 
of the gut), we fitted a series of models, which differ in the number 
of beneficial mutations. Starting from the simplest minimal model 
where only one beneficial mutation occurs, we sequentially 
increase the number of mutations until a maximum of five. For 
each model this approach fits, by maximum likelihood, the 
observed frequency changes at a marker locus with two alleles (our 
fluorescent alleles) and then chooses the simplest possible model. 
The predictions described faithfully the empirical data in terms of 
marker frequencies (lines in Figure lA). With this method the 
distribution of haplotype fitnesses across all populations, can be 
determined (Figure 2). The inferred fitness effects of segregating 
haplotypes are quite large with a mean of (15%) and some 
haplotypes leading to increases in fitness of more than 30%. We 
note that given the intense CI observed, this approach misses some 
of the mutations contributing to the dynamics (see below). 



Genetic basis of the initial adaptations to the mouse gut 

To characterize genetically the first steps of the adaptive process 
in vivo, we performed whole genome sequencing (WGS) of the 
ancestral and independently evolved clones. Each evolved clone 
was isolated from a different mouse after colonization for 24 days 
(~400 generations). The mean number of mutations per evolved 
clone was 2.3, with a minimum of 1 and a maximum of 4 (Figure 3 
and Table S2, see also Table SI). 

The genetic analysis showed that similar and parallel adaptive 
paths were taken during the initial adaptation to the gut (Figure 3, 
Table S2). The most striking recurrent event was detected at the 
level of mutations in the gat operon, which occurred in all the 
sequenced clones (79% IS insertions and 21% small indels). The 
gat operon consists of six genes that collectively allow for galactitol 
metabolism [28]. We found that galactitol had an inhibitory effect 
on the ancestral strain when grown in minimal medium with 
glycerol (Figure S2). However in all the evolved clones (carrying 
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Figure 2. Distribution of fitness effects of beneficial haplotypes 
that contributed to adaptation. The fitness of beneficial haplotypes 
(o)/,) was estimated under a theoretical model which assumes the 
minimum number of beneficial mutations required to explain the 
marker dynamics. 

doi:1 0.1 371/journal.pgen.1 0041 82.g002 



PLOS Genetics | www.plosgenetics.org 



3 



March 2014 | Volume 10 | Issue 3 | e1004182 



Clonal Interference in the Gut 



dcuB (reg) 



srlR 



gat operon 



focA (reg) 



3X|-^ 



dcuB ~> 



CRP 



srIR > 



y Insertion 1 bp 
T IS insertion 
i Deletion 1-2bp 
A Deletion ~5kb 
I SNP 

" Duplication 



3x 



gate 



gaM gafS 







2x 

T 


1 focA y 



V 



Figure 3. The genetic basis of adaptive mutations and the level of parallelism between populations. Identified mutations in clones 
isolated from populations 1.1 to 1.14 (evolved in vivo for 24 days), represented along the E. coli chromosome. For simplicity, the genomes are 
represented linearly and vertically drawn. The type and position of mutations are shown by triangles for insertions and deletions, small vertical bars 
denote single nucleotide polymorphisms (SNPs), and one duplication in clone number 1 .1 2 is depicted as a horizontal bar. See the symbol legend for 
other events. The genes dcuB, srIR and focA and one operon (gat) are highlighted. These represent regions of parallel mutation in at least two 
genomes. The genomic context of these mutations is represented on the right, (reg) after the gene name, means that the regulatory region, rather 
than the coding region, was affected. Numbers above marked mutations represent the number of times a particular mutation was detected at the 
same position. 

doi:1 0.1 371/journal.pgen.1 0041 82.g003 



mutations either in the coding region of gatA, gatC, gat^ or in tlie 
regulatory region of gatY (see Figure 3)) this inhibition was 
surpassed (Figure S2). AH these clones exhibited a ^fli-negative 
phenotype, inability to metabolize galactitol. Since galactitol is 
part of the host' metabolism of galactose, E. coli might be 
frequently exposed to this compound in the gut. This may have 
selected for the ^fl/-negative phenotype. 

Parallelism at the gene level was also observed: four clones 
carried mutations in srlR, which is a DNA-binding transcription 
factor that represses the srl operon involved in sorbitol metabolism 
[29]. At least two of the mutations inactivated the gene, since these 
caused a frameshift and a stop codon (Table S2). The inactivation 
oi srlR leads to the constitutive expression of the srl operon [29], 
thus these mutants are expected to readily consume sorbitol, even 
at low concentrations. This might represent an advantage when 
facing limiting nutrients and strong competition, compatible with 
the hypothesis that the ability to use several limiting carbon 
sources is an important advantage for E. coli inhabiting the 
mammalian gut [30,31]. 

Two other parallelisms in the regulatory regions of dcuB (three 
clones) and focA (two clones) were observed. Both of these genes 
encode transmembrane transporters whose expression occurs 
almost exclusively in anaerobic conditions. The Dcu system 
mediates the uptake, exchange and efflux of C4-dicarboxylic acids 
[32]; fumarate is a C4-dicarboxylate acid and is the most 
important anaerobic electron acceptor for E. coli in the intestine 
[33]. Additionally, yoc^4 is a putative transporter of formate [34] 
and is the "signature compound" of £. coli anaerobic metabohsm 
[35]. During anaerobic growth, pyruvate is cleaved into formate 
that is subsequently used as a major electron donor for the 
anaerobic reduction of fumarate [36] and nitrate or nitrite [37]. 
Hence, given their roles in anaerobic respiration it would be 
expected that dcuB and focA are under selective pressure in the gut 



and that the mutations observed may be important for regulating 
anaerobic respiration of E. coli in the gut. 

Interestingly, one clone carried a non-synonymous mutation in 
arcA, a global regulator that governs respiratory flexibility of E. coli. 
arcA allows controlled switching between aerobic and anaerobic 
respiratory genes during fluctuating oxygen conditions, a trait 
crucial for E. coli to efficiently compete with the gut flora composed 
essentially of anaerobes [33,38]. 

Previous studies have found increased expression of dcu, gat and 
srl regulons during growth of is. coli on mucus [20]. Together with 
our fmdings of mutations on these genes, suggests that these 
regulons are under strong selection. 

One common adaptation in E. coli reported to occur in the 
mouse gut is decreased motility [21,31,39], which typically occurs 
in strains that are hypermotUe as a result of an IS element 
upstream of the 7?AZ)C operon. Since our strain is not hypermotile, 
it is not surprising that mutations in this region were not observed 
in our study. 

Taken together, our genetic analyses revealed that the initial 
adaptive steps oiE. coli to the mouse gut involved gene inactivation 
or modulation through IS elements. In terms of the relative 
contribution of regulatory versus coding regions to adaptation 
[40,41], we found that half of the IS insertions occurred in 
regulatory regions and one third of all mutations were located in 
these regions. We therefore observe that both changes in 
regulatory and coding regions contributed to E. coli adaptation 
to the gut. We also note that the first step of adaptation involved 
loss of function mutations (namely in the gat operon), which 
supports the recent view that nuU mutations may play an 
important role in early adaptation to new environments [42]. 

Previous studies [4,43] focusing on evolution of E. coli 
populations in glucose-limited chemostats, a device used to 
simulate the human gut [44], during a similar period of time (26 
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Figure 4. Emergence and spread of beneficial mutations in tlie gat operon. Dynannics of frequency change of the gat-negative phenotype 
over time are shown for all populations (1.1 to 1.15). Inset: The natural logarithm of the ratio of gaf-negative individuals to wild type over the first 5 
days of adaptation is shown as dots. Each group of points was fitted to a linear regression (represented as lines). Highlighted in bold is the population 
1.12 for which the slope corresponds to an estimate of the selection coefficient of 0.075±0.01 (per generation). 
doi:1 0.1 371 /journal.pgen.1 0041 82.g004 



days), found a high degree of paralleK.sm just as we observe here. 
Furthermore extensive phenotypic diversity was observed [43] and 
whole genome sequencing identified similar numbers of substitu- 
tions and a contribution of IS elements to the adaptive process 
[45] . However the targets of selection in glucose-limited chemo- 
stats were very different from those found during its adaptation to 
the gut, reflecting the distinct specific selective pressures E. coli is 
subjected to in the different environments. 

Phenotypic hard sweeps and genetic soft sweeps at the 
gat operon 

Given the striking parallelism that occurred in the gat operon, 
through knockout mutations, we sought to determine the timing of 
its emergence in each of the populations. We followed the 
dynamics of the ^aZ-negative phenotype during adaptation 
(Figure 4). In eight populations the advantageous mutation could 
be detected just 2 days post-colonization. In one population an 
extraordinarily rapid phenotypic sweep could be seen: a popula- 
tion where initially the majority of clones were ^aZ-positive changed 
completely its phenotype within 2 days (Figure 4, population 1.5), 
showing the strong benefit associated with this phenotype. 

In most populations the increase in frequency of the ^ai-negative 
phenotype was followed by a change in marker frequency (Figure 
S3A), consistent with the expected dynamics of a strong beneficial 
mutation occurring in a given fluorescent background sweeping to 
fixation. However in some populations fixation of the ^ai-negative 
phenotype was observed without fixation of the neutral marker 
(Figure S3B). This suggests that independent mutations causing 
the same phenotype have occurred, i.e. clonal competition 
emerging from multiple alleles at the same locus with similar 
selective effects is driving these dynamics. 

We next sought to estimate the strength of selection associated 
with new beneficial alleles that emerge and escape stochastic loss 



from the initial change in frequency of the ^aZ-negative phenotype. 
If we take population 1.12, where only one allele was detected in 
the gat operon, we can estimate the selective advantage conferred 
by this mutation {sgat) from the slope of the Ln(x/(l-x)) along time, 
where x is the frequency of the ^aZ-negative phenotype. When 
doing so we find that j^^, = 0.07 (±0.01) (Figure 4 inset). In the 
other populations the strength of selection is more difficult to 
estimate due to the emergence of many distinct alleles, which leads 
to an underestimate of their beneficial effect due to CI. 

To determine directiy the selective advantage gat alleles we 
performed in vivo competitions against the ancestor. As shown in 
Figure S4, the mutant gat^ (clone 4YFP) had, on average, an 
advantage of 0.065 (±0.013) over the ancestral, which is 
compatible with that inferred from the dynamics of the gat- 
negative phenotype. 

To gain further insight into the regime of interference detected 
by the change in frequency of the neutral markers, we studied 
polymorphism levels of four of the adapting populations. We 
followed the change in frequency of haplotypes at diflerent points 
in time in two populations that showed a rapid change in marker 
frequency (populations 1.5 and 1.12) and in two other populations 
where the neutral marker was kept polymorphic throughout the 
period studied (1.1 and 1.11) (Tables S3 to S6). Each haplotype 
was the combined genotype at 7 selected loci: gatY, gatZ, gatA, gatC, 
srlR, dcuB and focA. These loci were previously observed to be 
mutational targets in two or more of the sequenced independentiy 
evolved clones (see Fig. 3), which we interpret as important players 
in adaptation to the gut, and thus likely exhibiting segregating 
polymorphisms in most populations. We note that we determined 
the haplotype structure of a sample of clones (between 20 and 40 
clones per time point, per population) based on the mutations 
found in the sequenced clones, that is, we looked for IS insertions 
in gatY, gat^ gatA, gatC, dcuB and focA, and SNPs or small iiidels in 



PLOS Genetics | www.plosgenetics.org 



5 



March 2014 | Volume 10 | Issue 3 | el 0041 82 



Clonal Interference in the Gut 




108 126 



306 



432 0 



108 



198 



306 



432 



Figure 5. Graphic representation of the frequencies of newly generated haplotypes along 24 days (corresponding to 432 
generations) of evolution inside the mouse gut (see Tables S3 to S6 for numeric data). Shaded areas are proportional to the relative 
abundance of each haplotype. Yellow and blue shaded areas represent the two sub-populations of bacteria labeled either with cfp or yfp alleles. The 
ancestry relations between haplotypes can be inferred by the accumulation of new mutations in a previously existent genotype. Dash lines mark the 
time points in days (upper axis) or generations (lower axis) where the sampling took place. For the top two populations 1.1 (A) and 1.11 (B), 40 clones 
were sampled in each time point. For the bottom two populations 1.12 (C) and 1.5 (D) 20 clones were sampled in each time point. 
doi:1 0.1 371 /journal.pgen.1 0041 82.g005 



gat^ gate and srlR (see Methods). The presence of an extra 
mutation, a duplication of ~ 1 50 Kb, was also tested in population 
1.12. 

Figure 5 shows the dynamics of these haplotypes in the 
populations studied (see also Figure S5 for the sums of the 
frequencies of genotypes with possible equivalent phenotypes over 
time). Extensive haplotype diversity is found in all populations but 
to a lesser extent in populations 1.12 and 1.5. For example, in the 
frrst ~100 generations we observe 16 and 8 haplotypes segregating 
in populations 1 . 1 and 1.11, but only 3 and 5 in populations 1.12 
and 1.5, respectively. This observation is concordant with the 
neutral marker dynamics, where populations 1.12 and 1.5 showed 
a rapid sweep of one of the markers while the other two (1.1 and 
1.11) maintained the ratio of these markers closer to 50% for a 
longer period. The first mutation detected, without exception, 
impairs galactitol metabolism, thus surpassing the vital inhibitory 
effect of this compound on E coli growth. We observed multiple 
soft sweeps of mutations, with possibly similar fitness benefit, 
occurring independently in the gat operon and competing for 



fixation. We directly tested for this by performing competitions 
between clones carrying different gat alleles: mutations in gatZ 
against gatC. The results of these competitions show an equivalent 
selective coefficient of at least these two mutants (Figure S6), 
supporting the hypothesis that the different mutations in the gat 
operon present in the populations confer the same phenotype and 
fitness advantage. This provides an explanation for the observed 
coexistence between different alleles at the same locus and the 
phenotypic replacement detected. 

After the mutations in the gat operon secondary adaptive 
mutations occurred and promoted the increase in the frequency of 
haplotypes carrying more than one beneficial mutation. Overall, 
the adaptive process appears non-mutation limiting, allowing 
polymorphism levels within populations to be kept high until the 
end of the period studied. A closer look to the haplotype dynamics 
points to the strong possibility of further mutations occurring 
beyond the ones we have targeted. For example, at generation 126 
in population 1.12, the increase in frequency of the neutral allele 
yfp seems to be caused by some other mutation besides gatC. 
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Furthermore the occurrence of a large duplication in the same 
population appears to cause a beneficial elTect. To gain access to 
further beneficial mutations, we performed whole genome 
sequencing of this (1.12) and two other populations (1.11 and 
1.10) at the end-point of the experiment (see Table S7). This 
procedure aimed at querying the possibility of beneficial mutations 
at other loci significantly contributing to the adaptive process. In 
aU populations we observed multiple mutations in some of the 
targets previously identified (gat operon, srlR and focA) as well as 
several other loci that were targets for mutation unique to a single 
population. The adapti\'(; value of these last events is difficult to 
ascertain since tliesi; might have increased in frequency to 
dctcc table \alues due to linkage with other beneficial mutations 
or just by chance. This analysis allowed uncovering two new 
parallel mutational targets. One of these targets was the regulatory 
region oi jjjP where an IS insertion occurred in population 1.11 
and also on clone 3YFP (from population 1.3) and the other target 
was asnA, where either a deletion or a duphcation of around 50 bp 
was observed in two different populations (1.11 and \A2). yjjP 
codes for an inner membrane structural protein of unknown 
function [46] and asnA codes for one of the two asparagine 
synthetases in E. coli, catalyzing the ammonia-dependent conver- 
sion of aspartate to asparagine [47]. 

Potential role of epistasis in the adaptive process 

Overall, the haplotype analysis found ~6% of triple mutants. 
These were all ^ai-negative and carried either a srlR and dcuB 
mutation or a srIR tindfocA mutation. dcuB andfocA were never 
found in combination, suggesting that they may interact 
epistaticaUy. This can be possibly due to their common 
involvement in anaerobic respiration modulation, thus fulfilling 
related functions. This last observation suggests a role for negative 
epistatic interactions whereby mutations are beneficial individually 
but deleterious when in combination. The fact that repeated 
selection of dcuB or focA mutations was observed, but these 
mutations were never found in the same genetic background, may 
indicate a disadvantage of the double mutant carrying mutations 
that individually are presumably beneficial. Another example of 
possible epistasis is the case of the mutations in the gat operon. 
These occur very rapidly and lead to the inactivation of the gat 
operon, conferring an estimated advantage of approximately 7%. 
It is therefore expected that additional mutations subsequendy 
occurring in other gat genes would not confer a strong or any 
advantage, since the first mutation already resulted in a loss of 
function. 

Test for negative frequency-dependent selection 

Negative frequency dependent selection, i.e., fitness advantage 
of clones when at low frequency and fitness disadvantage when at 
high frequency, is known to lead to maintenance of genetic 
diversity [12,43,48]. Since the gut is a complex environment [49] 
where non-transitive ecological interactions may occur, we tested 
for variation in fitness elfects with initial frequency of evolved 
clones. Negative frequency-dependent interactions could explain 
the long-term maintenance at low frequency of some of the 
fluorescent subpopulations observed in the adaptive dynamics 
(Figure lA, populations 1.5, 1.12 and 1.13). We tested population 
1.13 for this type of selec:tion, where CFP marker increased tiU 
0.95 but never achieved fixation. In Figure S7 we show the results 
of the tested adapted clones in competitive fitness assays at 
different initial frequencies. While we observed that clones have an 
advantage when rare we also obser\'e that such selective advantage 
is present when at high frequency. No disadvantage was found at 
any of the initial frequencies tested, which suggests that negative 



frequency dependent selection is not a major force operating in 
this population. 

No evidence for emergence of mutators 

Our finding that mutations of large beneficial effects can 
accumulate rapidly in vivo suggested to us that mutators could have 
emerged. Mutators may play an important role in bacterial 
evolvability and their emergence typically involves mutations in 
genes of the DNA repair system [50]. Depending on the gene 
mutated a mutator bacteria can acquire an increase in its genome- 
wide mutation rate of 10 up to 10 000 fold [51]. Indeed co- 
colonization of wild-type and mutator bacteria in germ free mice 
[52] has shown that mutators can increase in frequency by 
hitchhiking with beneficial mutations to which they are genetically 
linked to. Furthermore mutators have been observed to sometimes 
emerge de novo in the early stages of adaptation to glucose-limited 
environments in laboratory experiments [53,54]. We therefore 
tested for an increased rate of mutation through fluctuation assays 
of several adapted clones. These tests however provided no 
evidence for significant increases in genomic mutation rate (Figure 
88). 

Conclusions 

In conclusion, we demonstrate here that E. coli MG1655 adapts 
very rapidly to the intestine of streptomycin-treated mice. The first 
steps of adaptation of E. coli to the mouse gut are dominated by 
soft sweeps and adaptive mutations of large effect. We observed a 
regime of intense clonal interference where haplotypes carrying 
more than one beneficial mutation compete for fixation. This is 
the regime assumed under the Fisher-MuUer theory that predicts 
that recombination speeds up adaptation by reducing competition 
between beneficial mutations. This data therefore provides 
support for the Fisher-MuUer effect as an important mechanism 
for the maintenance of homologous gene recombination in 
bacteria [55]. It shows that E. coli can adapt to this complex 
ecosystem very fast: within two days a phenotypic replacement 
could be detected. The first beneficial mutations targeted the same 
gene or operon and transposable elements made a substantial 
contribution to adaptation. These results demonstrate the 
remarkable adaptive potential of bacteria in one of the most 
complex environments of their natural (icolog)' and may have 
important consequences for our understanding of the species and 
strain diversity in the gut microbiota. 

Material and Methods 

Ethics statement 

All experiments involving animals were approved by the 
Institutional Ethics Committee at the Instituto Gulbenkian de 
Ciencia (project nr. A009/2010 with approval date 2010/10/15), 
following the Portuguese legislation (PORT 1005/92), which 
complies with the European Directive 86/609/EEC of the 
European Council. 

£ coli Strains 

All strains used were derived from Escherichia coli K-12, strain 
MG1655 [44]. 

Strains DM08-YFP and DM()9-CFP (MG1655, galK-TFP/CFP 
amp (PZJ2), str^ [rpsllSO), AladZfA) were used in the initial 
colonization experiment. These strains contain the yellow (yfp) or 
cyan (cfp) fluorescent genes linked to amp'*' in the galK locm under 
the control of a lac promoter and were obtained by PI 
transduction from previously constructed strains [2]. To ensure 
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constitutive expression of the fluorescent proteins the lac operon 
was deleted using the Datsenko and Wanner method [56]. 

Fluorescent marker dynamics during mouse colonization 

To study E. colt adaptation to the gut we used a streptomycin- 
treated mouse colonization model [49]. Briefly, 6- to 8-week old 
C57BL/6 male mice raised in specific pathogen fi-ee (SPF) 
conditions were given autoclaved drinking water containing 
streptomycin (5 g/L) for one day. After 4 hours of starvation for 
water and food the animals were gavaged with 100 |j,l of a 
suspension of lO" colony forming units (CFUs) of a mixture of 
YFP- and CFP-labeled bacteria (ratio 1:1) grown at 37°C in brain 
heart infusion medium to ODggo of 2. After gavage, the animals 
were housed separately and both food and water containing 
streptomycin were returned to them. Mice fecal pellets were 
collected for 24 days, diluted in PBS and plated in Luria Broth 
agar (LB agar). Plates were incubated overnight and the 
frequencies of CFP- or YFP-labeled bacteria were assessed by 
counting the fluorescent colonies with the help of a fluorescent 
stereoscope (SteREO Lumar, Carl Zeiss). Plating the fecal samples 
allowed checking for morphological [)lK'not\pi(' di\-crsit)' of the 
colonies, a phenomenon reported previously in previous studies of 
adaptation of £. coli to the mouse gut [20-22,31,39]. However, no 
consistent changes in colony morphology were observed during 
the evolution experiments. 

A sample of each collected fecal pellet was daily stored in 15% 
glycerol at — 80°C for future experiments. 

For the initial colonization a group of 5 mice was gavaged and 
foUowed for 24 days, and this procedure was repeated for two 
more groups of 5 mice. Thus a total of 15 mice were analyzed 
(Figure lA and SI). 

Competitive fitness assays in vivo to test for adaptation 

We measured the relative fitness in vivo by bacterial clones 
isolated from mouse fecal samples after 24 days of colonization 
(approximately 400 generations). Individual colonies or mixtures 
of approximately 30 colonies with the same fluorescent marker 
(population of clones) were picked from mouse fecal platings, 
grown in 10 ml of Luria Broth medium (LB) supplemented with 
ampiciUin (100 (Xg/ml) and streptomycin (100 Hg/ml) and stored 
in 15% glycerol at -80°C. 

The isolated clones were competed against the ancestral strain 
labeled with the opposite fluorescent marker, at a ratio of I to 1, in 
1-2 day co-colonization experiments following the same proce- 
dure described for the evolution experiments. The selective 
coefficient (fitness gain) of these clones in vivo (presented in 
Figure IB) was estimated as: 

where Sf, is the selective coefficient of the evolved clone, Rfe^/anc o-^d 
Riev/anc are the ratios of evolved to ancestral bacteria in the end (J) 
or in the beginning [i] of the competition and t is the number of 
generations per day. We assume <= 18, in accordance with the 80 
minute generation time estimated in previous studies on E. coli 
colonization of streptomycin-treated mouse [24,49,57]. 

Estimation of rate of beneficial mutations 

To infer the effective evolutionary parameters [Ug and we 
used the method described in [25], available online at http:// 
barricklab.org/twiki/bin/view/Lab/ToolsMarkerDivergence. 
This procedure involves three main steps: simulate families of 



marker trajectories at many combinations of mutation rate [U] and 
selection coefficient {s), summarize the simulated and experimental 
marker trajectories in the statistics Te and Me (that represent the 
time of divergence of marker frequency and the rate of divergence, 
respectively); compare the distributions of and otg of the 
simulated dynamics and the experimental data and find the values 
of U and s that best explain the experimental marker dynamics. 

Using the model described in [25] we simulated sets of 100 
replicate populations evolved under different parameter combi- 
nations of U and s. U ranged from 1 0 " to 1 0 ' (with increments 
of 10"", 10"', 10"'', 10"-^ and 10"*) and j ranged from 0.07 to 
0. 1 (with increments of 0.005). We assumed a population size of 
lO', based on the weight of feces per day (1.5±0.12 (2 s.e.m.) 
grams) that a mouse produces, on the observed number of CFTJs 
(per gram of feces) along the experiments (which remained stable 
over the length of the experiment around lO", see Figure SI), and 
on the number of generations per day, which is 18 [24]. A 
population size of the same order of magnitude was measured by 
Leatham-Jensen et al. [58] in the intestinal mucus, where the 
majority of bacterial division occurs [49] . 

This simulated data consist of the logarithm of the ratio of the 
two marker frequencies (Rf = frrAi^)/ ifcpA^^J) several time 
points (ti), (ln(Rf(ti))). 

We then used both the simulated and experimental marker 
dynamics as input in the program marker_divergence_fit.pl, that 
summarizes the evolutionary dynamics in two statistics: Te and a^. 
The first is the time, x^, where a significant deviation of ln(Rf(Te)) 
from ln(Rf{/ = 0)) occurs. The second is the rate of change of 
ln(Rf(<)) with time, that is, Te sets the time of di\ ergence of marker 
frequency and the rate of divergence. Each replicate population 
is summarized by a single value of Te and a.^, and the n replicate 
populations (characterized by a given combination (U, S)) result in 
a distribution of T[t,.) and ^(ote). These distributions are then 
compared, using the program marker_divergence_significanc(;.pl, 
to the distributions of Te and Me that summarize- the experimental 
data To(Te) and Ao(ae) using a two-dimensional Kolmogorov- 
Smirnov to test the fit between the simulated data and the pseudo- 
observed data. The combination {U, S) that gives rise to the highest 
P value is taken as and ^e, even when the hypothesis that the 
distributions are different cannot be rejected. 

Estimation of distribution of haplotypes fitness 

To estimate the fitness effects of beneficial mutations that may 
contribute to adaptation we used a maximum likelihood approach 
to infer the number, fitness effect and time of establishment of 
beneficial mutations in an asexual population from individual 
marker frequencies trajectories [26]. This method, which is 
available on web page: http://www.sanger.ac.uk/resources/ 
software /optimist/, makes no a priori assumptions about the 
distribution of mutation effects. It has been tested against 
simulated data and has been shown to be able to: reproduce 
complex trajectories of neutral markers, make accurate estimates 
of the distribution of haplotype fitness effects, and provide good 
estimates of the number of mutations segregating at high 
frequencies, at least for values of mutation rates that are not very 
large. We note that this method may miss many small effect 
beneficial mutations if these indeed occur at high rates. 

It assumes an initial population of cells comprising two 
haplotypes with equal fitness, corresponding to two neutral 
markers. A beneficial mutation occurring in one of the 
backgrounds is represented as a new haplotype, which is assumed 
to exist at a given time (TJ;) and with a frequency of 0.001 in the 
population. Since this new haplotype has increased fitness (14^) 
relative to the initial wild-type population, its frequency wiU 
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increase, leading to an increase in frequency of the fluorescent 
marker population in which the mutation has occurred. The 
simplest model that can be assumed is that where a single 
beneficial mutation occurs. That will imply a given time of 
establishment and effect of the mutation that maximize the 
probability of observing the marker frequency data. The second 
model that is considered is one where two mutations can occur, 
which will lead to a given likelihood of the data. Models assuming 
different numbers of beneficial mutations can thus be compared 
based on their likelihood score using Akaike information criteria 
[26]. With this method we obtain a distribution of haplotype 
fitnesses with the minimal number of mutations that can explain 
the marker frequency dynamics (a)^). 

Whole genome re-sequencing and mutation prediction 

Cbne analysis: After 24 days of gut colonization one clone from 
populations 1.1 to 1.14 and the two ancestors (MG1655-YFP and 
MG1655-CFP) were isolated and used to seed 10 mL of LB (Line 
1.15 was not analyzed since the mouse from this line died at that 
time point). These cultures were then grown at 37°C with 
agitation. Subsequently DNA was isolated following a previously 
described protocol [."59]. The DNA library construction and 
sequencing was carried out by BGL Each sample was pair-end 
sequenced on an Dlumina HiSeq 2000. Standard procedures 
produced data sets of Illumina paired-end 90 bp read pairs with 
insert size (including read length) of ~470 bp. Genome sequenc- 
ing data have been deposited in the NCBI Read Archive, http:/ / 
www.ncbi.nlm.nih.gov/sra (accession no. SRPO 17347). Mutations 
were identified using the BRESEQ_ pipeline [60]. To detect 
potential duplication events we used ssaha2 [61] with the paired- 
end information. This is a stringent analysis that maps reads only 
to their unique match (with less than 3 mismatches) on the 
reference genome. Sequence coverage along the genome was 
assessed with a 250 bp window and corrected for GC% 
composition by normalizing by the mean coverage of regions 
with the same GC%. We then looked for regions with high 
differences (>1.4) in coverage. Large deletions were identified 
based on the absence of coverage. For additional verification of 
mutations predicted by BRESEQ^ we also used the software IGV 
(version 2.1) [62]. 

Ancestal genome: The sequence reads from MG1655 were mapped 
to the reference strain [63]. The two ancestors carried the mutations 
listed in Table SI. The mutations underlined were present in the 
YFP ancestor but not in the CFP. The sequences of the 14 
sequenced clones were then interrogated against this ancestral 
genome and the mutations identified are listed in Table S2. 

Population analysis: DNA isolation was obtained in the same way 
as described above for the clone analysis except that instead of one 
clone per population a mixture of >1000 clones per population 
was used. Three populations were sequenced: 1.10, 1.11 and 1.12. 

The DNA library construction and sequencing was carried out 
by the IGC genomics facility. Each sample was pair-end 
sequenced on an lUumina MiSeq Benchtop Sequencer. Standard 
procedures produced data sets of Illumina paired-end 250 bp read 
pairs. The mean coverage per sample was as ~260x for 
population 1.10, ~160x for population 1.11 and ~100x for 
population 1.12. Genome sequencing data have been deposited in 
the NCBI Read Archive http://www.ncbi.nlm.nih.gov/sra (ac- 
cession no. SRP()33()25). Mutations were identified using the 
BRESEQ_ pipeline with the polymorphism option on (see Table 
S7). The default settings were used except for: a) require a 
minimum coverage of 3 r(-ads on each strand per polymorphism; 
b) eliminate polymorphism predictions occurring in homopoly- 
mers of length greater than 3; c) eliminate polymorphism 



predictions with significant (p-value<0.05) strand or base quality 
score bias. All predicted polymorphisms were manually inspected 
using IGV. 

Test for the advantage of the evolved clones in the 
presence of galactitol 

AU clones that were sequenced carried mutations in the gat 
operon. To test for any fitness advantage of these evolved clones, 
we measured their growth in M9 minimal medium (MM) 
supplemented with glycerol and with glycerol and galactitol, and 
compared growth in both conditions. AU growth curves were 
conducted in 96-weII plates incubated at 37°C with aeration 
(Thermoshaker PHMP-4, Grant). After a first overnight growth in 
MM with glycerol (0.4%) the cultures were normalized to the same 
ODgoo (approximately 0.1) and diluted 100-fold. We used 5 jtl of 
the 10 ^ dilution to inoculate in triplicate wells containing MM 
supplemented either with glycerol (0.4%) or glycerol (0.4%) and 
galactitol (0.4%). Optical density at 600 nm was monitored for 
36 hours using a microplate reader (Victor3, PerkinElmer). The 
results are shown in Figure S2. 

Emergence and dynamics of mutations in the gat operon 

To investigate the dynamics of appearance and expansion of 
beneficial mutations in the gat operon we determined the 
frequency of bacteria unable to metaboKze galactitol (gai-negative 
phenotype) within a given population of the evolution experiment. 

For each population, a sample of the frozen fecal pellets was 
diluted in PBS and plated in Mac Conkey agar supplemented with 
1% of galactitol and streptomycin (100 |J,g/ml); a differential 
medium used to monitor the ability of bacteria to use galactitol. 
Plates were incubated for 20 hours at 28°C. The frequency of 
galactitol mutants for each time point was estimated by counting 
the number of white (galactitol mutants) and red colonies in Mac 
Conkey-galactitol plates. 

Identification of adaptive mutations and estimate of 
haplotype frequencies in populations 1.1, 1.5, 1.11 and 
1.12 during 24 days of adaptation to the mouse gut 

In order to identify the adaptive mutations and estimate the 
haplotype frequencies, between 20 and 40 clones were analyzed 
per time point per population. These clones were obtained by 
plating directiy the frozen fecal samples and thus correspond to 
samples representative of the evolving E. coli population. The 
presence of the adaptive mutations was queried by target PGR. 
The increase in size of the PGR product was indicative of the 
presence of an IS element. The detection of SNPs was done by 
target Sanger sequencing using the primers listed in Table S8. 

PGR reactions were performed in a total volume of 50 [d 
containing 1 |tl bacterial culture, 10 |J,M of each primer (forward 
and reverse), 200 |J,M dNTPs, 1 U Taq polymerase and IX Taq 
polymerase buffer. The PGR reaction conditions were as follows: 
95°G for 3 min foUowed by 34 cycles of 95°G for 30 s, 60°C for 
30 s, 72°C for 2 min, foUowed by 72°C for 5 min. sriR, gatC and 
gat^ were sequenced directly from the PGR products using the 
primers listed in Table S8. 

To detect an insertion of 1 bp in gatC, the gene was PGR 
amplified and then submitted to an enzymatic restriction with 
Mval enzyme. The mutant was identified based on the fragment 
restriction profile. 

The 1 50 Kb duplication was detected by amplification of the 
new junction formed by the tandem duplication. The PGR was 
performed using the same conditions described before and the 
primers listed in Table SB. 
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Direct estimation of the advantage of the gat alleles by 
competition 

To determine the advantage of the gat alleles we performed in 
vivo fitnc'ss assays. The' comp(;tition experiments lasted for three 
days and were performed according to the protocol described 
above for the evolution experiment. 

In the first set of competitions (Figure S4) we competed a clone 
genetically similar to the ancestral but bearing an IS insertion in 
the gat^ gene (sequenced clone 4) against tlie ancestral. In the 
second set of competitions (Figure S6) we competed the gatZ 
mutant against a gatC mutant (derived from sequenced clone 12). 
In addition to a 1 bp insertion in the coding region of the gatC 
gene, this clone originally carried an unstable 150 kb duplication, 
lost during in vitro manipulations to create the mutant gatC. 

Testing for negative frequency dependent selection 

To test for this form of selection we isolated around 30 YFP and 
30 CFP clones from fecal samples after 24 days of colonization of 
population 1.13. We then performed competitive fitness assays in 
vivo. The mixture of the 30 YFP clones was competed against the 
mixture of the 30 CFP clones at three initial ratios (1:1, 10:1 and 
1:100) for 2 days, using the procedure described for the evolution 
experiments. The results are shown in Figure S7. 

Test for increased mutation rate 

To test for the possible emergence of mutators during adaptation 
to the gut we determined the frequency of rifampicin-resistant 
mutants, in each of the evolved populations. We grew pre-cultures 
of the evolved populations by inoculating in 10 ml of LB a sample of 
each population from the last day of the experiments (approximately 
400 generations of gut colonization). Pre-cultures were grown 
overnight at 37°C with aeration in the presence of ampicillin 
(100 (ig/ml) and streptomycin (100 Hg/ml). The pre-cultures were 
then diluted and approximately 1000 cells (10 ^1 of a 10'* dilution) 
were inoculated in triplicate in 1 ml of LB and incubated overnight. 
Aliquots of each tube were plated in LB agar and LB agar 
supplemented with rifampicin (100 |.lg/ml) and incubated overnight 
at 37°C. The frequency of mutation to rifampicin resistance was 
calculated as the ratio between the number of rifampicin resistant 
mutants and the total number of individuals in each population. 

Supporting Information 

Figure SI Colonization of the mouse gut by Escherichia coli. 
Bacterial loads per gram of feces (with 95% confidence intervals) 
during 24 days of adaptation of E. coli to the mouse gut 
(populations 1.1 to 1.15). 
(TIF) 

Figure S2 Comparison of growth curves of ancestral strain and 
evolved clones. Growth curves of ancestral (black) and evolved 
clones (grey) in MM with glycerol (A) and MM with glycerol and 
galactitol (B). Error bars represent the standard error of the mean 
of three independent measurements. 
(TIF) 

Figure S3 Emergence and spread of beneficial mutations in the 

gat operon. Dynamics of frequency change of the ,^aZ-negative 
phenotype (blue squares) and of the neutral fluorescent marker 
(orange diamonds) are shown for representative examples of 
populations where increase in frequency and eventual fixation of 
the ^fl^-negative phenotype was accompanied by strong divergence 
of the fluorescent marker ((A) populations 1.5, 1.12 and 1.13) or 
not ((B) 1.1, 1.4 and 1.11). 
(TIF) 



Figure S4 Direct estimate of the selective advantage of a gat 
allele over the ancestral in the mouse gut. Results of three 
independent competition experiments between gat^ mutant and 
the ancestral. We estimated a selective advantage of gat^ 
(corresponding to the slope (±2 s.e.m) of the linear regression of 
\n{gatZ^ axic) along the three days of competition) of 0.065 
(±0.013), R2 = 0.99 over die ancestral. 
(TIF) 

Figure S5 Sums of frequencies of genotypes at diflFerent loci 
along time for the populations represented in Figure 5. As in 
Figure 5 these frequencies are represented along 24 days 
(corresponding to 432 generations) of evolution inside the mouse 
gut (see Tables S3 to S6 for numeric data). A corresponds to 
population 1.1, B to 1.11, C to 1.12 and D to 1.5. 
(TIF) 

Figure S6 Direct estimate of the selective advantage of different 
gat alleles in the mouse gut. Results of three independent 
competition experiments between gatZ and gatC mutants. We 
estimated a selective advantage oigatZ (corresponding to the slope 

(±2 s.e.m) of the linear regression \n(gatZ/ gatQ along the three 
days of competition) of 0.005 (±0.016), R2 = 0.30 over gatC. 
(TIF) 

Figure S7 Test for negative frequency-dependent selection of 
evolved clones. 30 CFP and 30 YFP clones isolated from population 
1.13 after 24 days of colonization were competed in vivo at initial ratios 
CFP: YFP of 1 : 1 (open circles), 1:10 (open diamonds) and 1 00: 1 (open 
triangles) for 3 days (corresponding to approximately 54 generations). 
Three independent competition experiments were performed for 
each ratio. The selective advantages per generation of the CFP in 
relation to the YFP population were calculated for each ratio of CFP 
over YFP. These are based on the slopes of the linear regression of 
ln(CFP/YFP) along time. The slopes (±2 s.e.m) are: 0.11 (±0.04), 
R2 = 0.95for 1:1 (dotted fine); 0.10 (±0.02), R2 = 0.99 for l:10(solid 
line) and 0.06 (±0.05), R2 = 0.77 for 100:l(dashed line). A selective 
advantage was found for all the cases tested irrespective of their initial 
frequency, indicating that negative frequency-dependent selection is 
not the major process underling adaptation in the mouse gut. 
(TIF) 

Figure S8 Fluctuation tests of evolved clones to test for the 
emergence of mutator alleles during adaptation to the gut. Each 
symbol represents an independent measurement of the frequency of 
spontaneous mutants resistant to rifampicin. Measurements of the 
mutation frequency were performed for each of the 1 5 mice evolved 
populations (1.1 to 1.15) as well as for the respective ancestors. No 
signiflcant difference was found between the estimations of the 
mutation frequencies for the ancestors and evolved populations 
(Mann-Whitney test with Bonferroni correction, P>0.05). 
(TIF) 

Table SI Mutations identified in the genomes of the ancestral 

strain. 

(DOC) 

Table S2 The number and nature of adaptive events across 

independentiy evolved clones. 

(DOC) 

Table S3 Frequencies of newly generated haplotypes along 24 
days of evolution of population 1 . 1 inside the mouse gut. 
(DOCX) 

Table S4 Frequencies of newly generated haplotypes along 24 
days of evolution of population 1.5 inside the mouse gut. 
(DOCX) 
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Table S5 Frequencies of newly generated haplotypes along 24 
days of evolution of population 1 . II inside the mouse gut. 
(DOCX) 

Table S6 Frequencies of newly generated haplotypes along 432 
generations of evolution of population 1.12 inside the mouse gut. 
(DOCX) 

Table S7 The number and nature of adaptive events across 

independendy evolved populations. 

(DOCX) 

Table S8 Oligonucleotide primers used in this work. 
(DOCX) 
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